132 research outputs found

    A New Identity for the Least-square Solution of Overdetermined Set of Linear Equations

    In this paper, we prove a new identity for the least-squares solution of an overdetermined set of linear equations $Ax=b$, where $A$ is an $m\times n$ full-rank matrix, $b$ is a column vector of dimension $m$, and $m$ (the number of equations) is larger than or equal to $n$ (the dimension of the unknown vector $x$). Generally, the equations are inconsistent and there is no feasible solution for $x$ unless $b$ belongs to the column span of $A$. In the least-squares approach, a candidate solution is found as the unique $x$ that minimizes the error function $\|Ax-b\|_2$. We propose a more general approach that consists of considering all the consistent subsets of the equations, finding their solutions, and taking a weighted average of them to build a candidate solution. In particular, we show that by weighting the solutions with the squared determinant of their coefficient matrix, the resulting candidate solution coincides with the least-squares solution.
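
The identity stated in this abstract can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the paper: it assumes that the "consistent subsets" are the square subsystems of n equations with a nonsingular coefficient matrix, and the helper names (`det`, `solve_square`, `weighted_subset_solution`, `normal_equations_solution`) and the test data are hypothetical. Exact rational arithmetic is used so that the two solutions can be compared without rounding error.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def solve_square(M, v):
    """Solve M x = v by Cramer's rule; M must be square and nonsingular."""
    d = det(M)
    return [
        det([row[:j] + [v[i]] + row[j + 1:] for i, row in enumerate(M)]) / d
        for j in range(len(M))
    ]


def weighted_subset_solution(A, b):
    """Average the solutions of all n-row subsystems, weighted by det(sub_A)^2."""
    m, n = len(A), len(A[0])
    num = [Fraction(0)] * n
    den = Fraction(0)
    for rows in combinations(range(m), n):
        sub_A = [A[i] for i in rows]
        w = det(sub_A) ** 2
        if w == 0:
            continue  # a singular subsystem receives zero weight
        x = solve_square(sub_A, [b[i] for i in rows])
        num = [s + w * xi for s, xi in zip(num, x)]
        den += w
    return [s / den for s in num]


def normal_equations_solution(A, b):
    """Least-squares solution from the normal equations (A^T A) x = A^T b."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)] for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return solve_square(AtA, Atb)


# Hypothetical data: fit a line through four points (4 equations, 2 unknowns).
A = [[Fraction(v) for v in row] for row in [[1, 0], [1, 1], [1, 2], [1, 3]]]
b = [Fraction(v) for v in [1, 2, 2, 4]]
x_avg = weighted_subset_solution(A, b)
x_ls = normal_equations_solution(A, b)
print(x_avg == x_ls, x_ls)
```

Cramer's rule and Laplace expansion are chosen here for brevity; they scale poorly and are only suitable for tiny systems like this one.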

    Ad Hoc Microphone Array Calibration: Euclidean Distance Matrix Completion Algorithm and Theoretical Guarantees

    This paper addresses the problem of ad hoc microphone array calibration where only partial information about the distances between microphones is available. We construct a matrix consisting of the pairwise distances and propose to estimate the missing entries with a novel Euclidean distance matrix (EDM) completion algorithm that alternates between low-rank matrix completion and projection onto the Euclidean distance space. This approach confines the recovered matrix to the EDM cone at each iteration of the matrix completion algorithm. Theoretical guarantees on the calibration performance are obtained for random and locally structured missing entries as well as measurement noise on the known distances. This study elucidates the links between the calibration error and the number of microphones, the noise level, and the ratio of missing distances. Thorough experiments on real data recordings and simulated setups are conducted to demonstrate these theoretical insights. The proposed Euclidean distance matrix completion algorithm achieves a significant improvement over state-of-the-art techniques for ad hoc microphone array calibration. Comment: In Press, available online, August 1, 2014. http://www.sciencedirect.com/science/article/pii/S0165168414003508, Signal Processing, 201

    Theoretical Analysis of Euclidean Distance Matrix Completion for Ad hoc Microphone Array Calibration

    We consider the problem of ad hoc microphone array calibration where the distance matrix consisting of all pairwise microphone distances has missing entries corresponding to distances greater than $d_{\text{max}}$. Furthermore, the known entries are noisy, with the noise modeled as additive independent random variables with a strictly sub-Gaussian distribution, $\mathrm{Sub}(c^2(d))$, with a bounded constant that depends on the distance $d$ between the microphone pairs. In this report, we exploit a matrix completion approach to recover the full distance matrix. We derive theoretical guarantees on the microphone calibration performance, which demonstrate that the error of calibrating a network of $N$ microphones using matrix completion decreases as $\mathcal{O}(N^{-1/2})$.

    Broadband Beampattern for Multi-channel Speech Acquisition and Distant Speech Recognition

    Spatial filtering is the fundamental characteristic of microphone array based signal acquisition, which plays an important role in applications such as speech enhancement and distant speech recognition. In the array processing literature, this property is formulated upon beam-pattern steering and it is characterized for narrowband signals. This paper proposes to characterize the microphone array broadband beam-pattern based on the average output of a steered beamformer for a broadband spectrum. Relying on this characterization, we derive the directivity beam-pattern of delay-and-sum and superdirective beamformers for a linear as well as a circular microphone array. We further investigate how the broadband beam-pattern is linked to speech recognition feature extraction; hence, it can be used to evaluate distant speech recognition performance. The proposed theory is demonstrated with experiments on real data recordings.
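
As a rough illustration of the broadband characterization described in this abstract, the sketch below averages the narrowband power response of a delay-and-sum beamformer over a set of frequencies. It is not the paper's derivation: the far-field linear-array model, the uniform averaging of power over frequency, the speed of sound, and the names `das_response` and `broadband_beampattern` are all assumptions made for this example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def das_response(freq_hz, theta, steer, positions, c=SPEED_OF_SOUND):
    """Narrowband delay-and-sum response of a linear array (far field).

    Angles are measured from the array axis, in radians; positions are in metres.
    """
    k = 2.0 * math.pi * freq_hz / c  # wavenumber
    total = sum(
        cmath.exp(1j * k * p * (math.cos(theta) - math.cos(steer)))
        for p in positions
    )
    return total / len(positions)


def broadband_beampattern(theta, steer, positions, freqs):
    """Average the narrowband power response uniformly over the band (assumption)."""
    powers = [abs(das_response(f, theta, steer, positions)) ** 2 for f in freqs]
    return sum(powers) / len(powers)


# Hypothetical setup: 8 microphones spaced 5 cm apart, band 300 Hz to 3.4 kHz.
positions = [0.05 * i for i in range(8)]
freqs = [300.0 + 100.0 * i for i in range(32)]
steer = math.pi / 2  # broadside steering

for deg in (90, 60, 30):
    value = broadband_beampattern(math.radians(deg), steer, positions, freqs)
    print(deg, round(value, 3))
```

In this sketch the pattern equals 1 in the steering direction and falls below 1 elsewhere; a non-uniform spectral weighting (for example, a speech spectrum) could replace the plain average.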

    Enhanced Diffuse Field Model for Ad Hoc Microphone Array Calibration

    In this paper, we investigate the diffuse field coherence model for microphone array pairwise distance estimation. We study the fundamental constraints and assumptions underlying this approach and propose evaluation methodologies to measure the adequacy of diffuseness for microphone array calibration. In addition, an enhanced scheme based on coherence averaging and histogramming is presented to improve the robustness and performance of the pairwise distance estimation approach. The proposed theories and algorithms are evaluated on simulated and real data recordings for calibration of microphone array geometry in an ad hoc set-up.
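
To make the idea of coherence-based distance estimation concrete, the sketch below uses the standard coherence of a spherically isotropic (diffuse) noise field between two omnidirectional microphones, sin(2πfd/c)/(2πfd/c), and recovers the distance d by a grid search. This is an assumption-laden illustration, not the enhanced scheme of the paper: the noiseless synthetic data, the grid search, the speed of sound, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def diffuse_coherence(freq_hz, distance_m, c=SPEED_OF_SOUND):
    """Coherence of an ideal diffuse field between two omnidirectional microphones."""
    arg = 2.0 * math.pi * freq_hz * distance_m / c
    return 1.0 if arg == 0.0 else math.sin(arg) / arg


def estimate_distance(freqs, coherence, candidates):
    """Pick the candidate distance whose model coherence best fits the measurements."""
    def squared_error(d):
        return sum((diffuse_coherence(f, d) - g) ** 2 for f, g in zip(freqs, coherence))
    return min(candidates, key=squared_error)


# Synthetic, noiseless "measurements" for a pair 20 cm apart (hypothetical data).
freqs = [50.0 * k for k in range(1, 81)]        # 50 Hz to 4 kHz
true_d = 0.20
measured = [diffuse_coherence(f, true_d) for f in freqs]
candidates = [0.01 * k for k in range(1, 101)]  # 1 cm to 1 m in 1 cm steps
d_hat = estimate_distance(freqs, measured, candidates)
print(d_hat)
```

With real recordings the coherence would first have to be estimated from the signals, and the fit is only meaningful to the extent that the sound field is actually diffuse.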

    Microphone Array Beampattern Characterization for Hands-free Speech Applications

    Spatial filtering is the fundamental characteristic of microphone array based signal acquisition, which plays an important role in applications such as speech enhancement and distant speech recognition. In the array processing literature, this property is formulated upon beam-pattern steering and it is characterized for narrowband signals. This paper proposes to characterize the microphone array broadband beam-pattern based on the average output of a steered beamformer for a broadband spectrum. Relying on this characterization, we derive the directivity beam-pattern of delay-and-sum and superdirective beamformers for a linear as well as a circular microphone array. We further investigate how the broadband beam-pattern is linked to speech recognition feature extraction; hence, it can be used to evaluate distant speech recognition performance. The proposed theory is demonstrated with experiments on real data recordings.

    Combining Cepstral Normalization and Cochlear Implant-like Speech Processing for Microphone Array-based Speech Recognition

    This paper investigates the combination of cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition. Testing speech signals are recorded by a circular microphone array and are subsequently processed with superdirective beamforming and McCowan post-filtering. Training speech signals, from the multichannel overlapping Number corpus (MONC), are clean and not overlapping. Cochlear implant-like speech processing, which is inspired by the speech processing strategy in cochlear implants, is applied to the training and testing speech signals. Cepstral normalization, including cepstral mean and variance normalization (CMN and CVN), is applied to the training and testing cepstra. Experiments show that implementing either cepstral normalization or cochlear implant-like speech processing helps to reduce the word error rates (WERs) of microphone array-based speech recognition. Combining cepstral normalization and cochlear implant-like speech processing further reduces the WERs when there is overlapping speech. Train/test mismatches are measured using the Kullback-Leibler divergence (KLD) between the global probability density functions (PDFs) of training and testing cepstral vectors. This measure reveals a train/test mismatch reduction when either cepstral normalization or cochlear implant-like speech processing is used. It also reveals that combining these two processing steps further reduces the train/test mismatches as well as the WERs.
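
Cepstral mean and variance normalization, as mentioned in this abstract, can be sketched in a few lines. The version below is a generic per-utterance form and is not taken from the paper: it assumes each utterance is a list of frames, each frame a list of cepstral coefficients, and the function name `cmvn` and the small constant `eps` are hypothetical choices for this example.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-10):
    """Normalize each cepstral coefficient to zero mean and unit variance.

    Statistics are computed over all frames of one utterance (an assumption);
    eps guards against division by zero for constant coefficients.
    """
    columns = list(zip(*frames))            # one tuple per cepstral coefficient
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(x - mu) / (sd + eps) for x, mu, sd in zip(frame, means, stds)]
        for frame in frames
    ]


# Hypothetical toy utterance: 3 frames, 2 cepstral coefficients each.
frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = cmvn(frames)
print(normalized)
```

Applying the same normalization to both training and testing cepstra is what removes per-recording offsets and scale differences between the two sets.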

    Improving Microphone Array Speech Recognition with Cochlear Implant-like Spectrally Reduced Speech

    Cochlear implant-like spectrally reduced speech (SRS) has previously been shown to afford robustness to additive noise. In this paper, it is evaluated in the context of microphone array based automatic speech recognition (ASR). It is compared to and combined with post-filter and cepstral normalization techniques. When there is no overlapping speech, the combination of cepstral normalization and the SRS-based ASR framework gives a performance comparable with the best obtained with a non-SRS baseline system using maximum a posteriori (MAP) adaptation, on either the microphone array signal or the lapel microphone signal. When there is overlapping speech from competing speakers, the same combination gives significantly better word error rates than the best ones obtained with the previously published baseline system. Experiments are performed with the MONC database and the HTK toolkit.

    Computational Methods for Underdetermined Convolutive Speech Localization and Separation via Model-based Sparse Component Analysis

    In this paper, the problem of speech source localization and separation from recordings of convolutive underdetermined mixtures is studied. The problem is cast as recovering the spatio-spectral speech information embedded in a microphone array's compressed measurements of the acoustic field. A model-based sparse component analysis framework is formulated for sparse reconstruction of the speech spectra in a reverberant acoustic environment, resulting in joint localization and separation of the individual sources. We compare and contrast the computational approaches to model-based sparse recovery that exploit spatial sparsity as well as the spectral structures underlying the spectrographic representation of speech signals. In this context, we explore identification of the sparsity structures in the auditory and acoustic representation spaces. The auditory structures are formulated upon the principles of structural grouping based on proximity, autoregressive correlation and harmonicity of the spectral coefficients, and they are incorporated for sparse reconstruction. The acoustic structures are formulated upon the image model of multipath propagation, and they are exploited to characterize the compressive measurement matrix associated with microphone array recordings. Three approaches to sparse recovery, relying on combinatorial optimization, convex relaxation and Bayesian methods, are studied and evaluated based on thorough experiments. The sparse Bayesian learning method is shown to yield better perceptual quality, while interference suppression is also achieved using the combinatorial approach, which has the advantage of the lowest computational cost. Furthermore, it is demonstrated that an average autoregressive model can be learned for speech localization and that exploiting the proximity structure in the form of block sparse coefficients enables accurate localization. Through extensive empirical evaluation, we confirm that a large and random placement of the microphones enables significant improvement in source localization and separation performance.